Wazuh-agent troubleshooting guide.

If you see this error in kibana on an agent. It could be for a number of reasons.

Follow this process to figure it out.

  • Agent buffer on the client is full, which is caused by flood of alerts. The agents have a buffer size to keep resources on the clients consistent and minimal. If this fills up then kibana will stop collecting data.
  • The first step is the easiest log into the client and restart the client by
    • Systemctl restart wazuh-agent
    • /etc/init.d/wazuh-agent restart
    • And windows open the agent and click on restart
  • If you go kibana
    • Click on agents
    • Then find your agent
    • Click on a agent
    • Click security audit

It should look something like this.

 

 

If this does not appear then we need to check wazuh-manager

Reason1 :Space issues

Logs can stop generating if elastic-search partition reaches 85% full and put the manager into read only mode.

# ls /usr/share/elasticsearch/data/ (lives on a different lvm)

# ls /var/ossec (lives on a different lvm)

 

    • Ensure these partitions have plenty of space or wazuh will go into read only mode
    • Ones you have ensure there is adequate space you will need to execute a command in kibana to get it working again.

 

PUT _settings{    “index : {        blocks.read_only : “false”    }}

 

    • In kibana, go to dev tools and put the above code and play the code.

 

Alternative command that does the same thing.

  • curl XPUT http://localhost:9200/_settings ‘Content-Type: application/json’ d‘ { “index”: { “blocks”: { “read_only_allow_delete“: “false” } } } ‘

 

  • Next restart wazuh-manager and ossec
    • /var/ossec/bin/ossec-control restart
    • Systemctl restart wazuh-manager

Reason 2: Ensure services are running and check versions

  • Elasticsearch:curl XGET ‘localhost:9200’

[root@waz01~]# curl localhost:9200/_cluster/health?pretty

{

cluster_name” :elasticsearch“,

“status” : “yellow”,

timed_out” : false,

number_of_nodes” : 1,

number_of_data_nodes” : 1,

active_primary_shards” : 563,

active_shards” : 563,

relocating_shards” : 0,

initializing_shards” : 0,

unassigned_shards” : 547,

delayed_unassigned_shards” : 0,

number_of_pending_tasks” : 0,

number_of_in_flight_fetch” : 0,

task_max_waiting_in_queue_millis” : 0,

active_shards_percent_as_number” : 50.72072072072073

}

  • Kibana:/usr/share/kibana/bin/kibana V

[root@waz01 ~]# /usr/share/kibana/bin/kibana -V

6.4.0Logstash:/usr/share/logstash/bin/logstash V

[root@waz01 ~]# /usr/share/logstash/bin/logstash -V

logstash 6.4.2

 

    • Check to see if wazuh-manager and logstash are running
  • systemctl status wazuhmanager

 

Active and working

[root@waz01 ~]#systemctl status wazuh-manager

wazuh-manager.serviceWazuh manager

Loaded: loaded (/etc/systemd/system/wazuh-manager.service; enabled; vendor preset: disabled)

Active: active (running) since Thu 2018-10-18 12:25:53 BST; 4 days ago

Process: 4488 ExecStop=/usr/bin/env ${DIRECTORY}/bin/ossec-control stop (code=exited, status=0/SUCCESS)

Process: 4617 ExecStart=/usr/bin/env ${DIRECTORY}/bin/ossec-control start (code=exited, status=0/SUCCESS)

CGroup: /system.slice/wazuh-manager.service

├─4635 /var/ossec/bin/ossec-authd

├─4639 /var/ossec/bin/wazuh-db

├─4656 /var/ossec/bin/ossec-execd

├─4662 /var/ossec/bin/ossec-analysisd

├─4666 /var/ossec/bin/ossec-syscheckd

├─4672 /var/ossec/bin/ossec-remoted

├─4675 /var/ossec/bin/ossec-logcollector

├─4695 /var/ossec/bin/ossec-monitord

└─4699 /var/ossec/bin/wazuh-modulesd

 

Oct 18 12:25:51 waz01env[4617]: Started wazuh-db…

Oct 18 12:25:51 waz01env[4617]: Started ossec-execd

Oct 18 12:25:51 waz01env[4617]: Started ossec-analysisd

Oct 18 12:25:51 waz01env[4617]: Started ossec-syscheckd

Oct 18 12:25:51 waz01env[4617]: Started ossec-remoted…

Oct 18 12:25:51 waz01env[4617]: Started ossec-logcollector

Oct 18 12:25:51 waz01env[4617]: Started ossec-monitord

Oct 18 12:25:51 waz01env[4617]: Started wazuh-modulesd

Oct 18 12:25:53 waz01env[4617]: Completed.

Oct 18 12:25:53 waz01systemd[1]: Started Wazuh manager.

  • systemctl status logstash

Active and working

[root@waz01~]#systemctl status logstash

logstash.servicelogstash

Loaded: loaded (/etc/systemd/system/logstash.service; enabled; vendor preset: disabled)

Active: active (running) since Mon 2018-10-15 23:44:21 BST; 1 weeks 0 days ago

Main PID: 11924 (java)

CGroup: /system.slice/logstash.service

└─11924 /bin/java -Xms1g -Xmx1g –XX:+UseParNewGC -XX:+UseConcMarkSweepGCXX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnlyDjava.awt.headless=true –Dfile.encoding=UTF-8 –Djruby.compile.invokedynamic=true –Djruby.jit.threshold=0 -XX:+HeapDumpOnOutOfMemoryErrorDjava.security.egd=file:/dev/urandom -cp /usr/share/logstash/logstash-core/lib/jars/animal-sniffer-annotations-1.14.jar:/usr/share/logstash/logstash-core/lib/jars/commons-codec-1.11.jar:/u…

 

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,581][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won’t be used to determine the document _type {:es_version=>6}

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,604][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>”LogStash::Outputs::ElasticSearch“, :hosts=>[“//localhost:9200”]}

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,616][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,641][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{“template”=>”logstash-*”, “version”=>60001, “settings”=>{“index.refresh_interval“=>”5s”}, “mappings”=>{“_default_”=>{“dynamic_templates”=>[{“message_field”=>{“path_match”=>”mess

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,662][INFO ][logstash.filters.geoip ] Using geoip database {:path=>”/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-geoip-5.0.3-java/vendor/GeoLite2-City.mmdb”}

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,925][INFO ][logstash.inputs.file ] No sincedb_path set, generating one based on the “path” setting {:sincedb_path=>”/var/lib/logstash/plugins/inputs/file/.sincedb_b6991da130c0919d87fbe36c3e98e363″, :path=>[“/var/ossec/logs/alerts/alerts.json“]}

Oct 15 23:44:41 waz01logstash[11924]: [2018-10-15T23:44:41,968][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>”main”, :thread=>”#<Thread:0x63e37301 sleep>”}

Oct 15 23:44:42 waz01logstash[11924]: [2018-10-15T23:44:42,013][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

Oct 15 23:44:42 waz01logstash[11924]: [2018-10-15T23:44:42,032][INFO ][filewatch.observingtail ] START, creating Discoverer, Watch with file and sincedb collections

Oct 15 23:44:42 waz01logstash[11924]: [2018-10-15T23:44:42,288][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}

 

If any of these are failed restart them.

  • systemctl restart logstashsystemctl restart elasticsearchsystemctl restart wazuh-manger

Reason 3: Logstash is broken

  • Check the logs for errors.
  • tail /var/log/logstash/logstash-plain.log

 

Possible error#1 :

[root@waz01 ~]# tail /var/log/logstash/logstash-plain.log

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({“type”=>”cluster_block_exception“, “reason”=>”blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];”})

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({“type”=>”cluster_block_exception“, “reason”=>”blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];”})

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>2}

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({“type”=>”cluster_block_exception“, “reason”=>”blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];”})

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>2}

[2018-10-09T17:37:59,475][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>3}

[2018-10-09T17:37:59,476][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({“type”=>”cluster_block_exception“, “reason”=>”blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];”})

[2018-10-09T17:37:59,476][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

 

 

Possible error#2 :

 

[2018-10-15T20:06:10,967][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

[2018-10-15T20:06:26,863][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<ArgumentError: Path “/var/lib/logstash/queue” must be a writable directory. It is not writable.>, :backtrace=>[“/usr/share/logstash/logstash-core/lib/logstash/settings.rb:447:in `validate'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:229:in `validate_value‘”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:140:in `block in validate_all‘”, “org/jruby/RubyHash.java:1343:in `each'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:139:in `validate_all‘”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:278:in `execute'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67:in `run'”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:237:in `run'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:132:in `run'”, “/usr/share/logstash/lib/bootstrap/environment.rb:73:in `<main>'”]}

[2018-10-15T20:06:26,878][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

[2018-10-15T20:06:42,543][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<ArgumentError: Path “/var/lib/logstash/queue” must be a writable directory. It is not writable.>, :backtrace=>[“/usr/share/logstash/logstash-core/lib/logstash/settings.rb:447:in `validate'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:229:in `validate_value‘”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:140:in `block in validate_all‘”, “org/jruby/RubyHash.java:1343:in `each'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:139:in `validate_all‘”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:278:in `execute'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67:in `run'”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:237:in `run'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:132:in `run'”, “/usr/share/logstash/lib/bootstrap/environment.rb:73:in `<main>'”]}

[2018-10-15T20:06:42,557][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

[2018-10-15T20:06:58,344][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<ArgumentError: Path “/var/lib/logstash/queue” must be a writable directory. It is not writable.>, :backtrace=>[“/usr/share/logstash/logstash-core/lib/logstash/settings.rb:447:in `validate'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:229:in `validate_value‘”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:140:in `block in validate_all‘”, “org/jruby/RubyHash.java:1343:in `each'”, “/usr/share/logstash/logstash-core/lib/logstash/settings.rb:139:in `validate_all‘”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:278:in `execute'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67:in `run'”, “/usr/share/logstash/logstash-core/lib/logstash/runner.rb:237:in `run'”, “/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:132:in `run'”, “/usr/share/logstash/lib/bootstrap/environment.rb:73:in `<main>'”]}

[2018-10-15T20:06:58,359][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exi

 

Probably need to reinstall logstash

 

1. Stop affected services:

 

# systemctl stop logstash# systemctl stop filebeat (this should not be installed on a stand alone setup as it causes performance issues.

 

2. Remove Filebeat

 

# yum remove filebeat

 

 

3. Setting up Logstash

# curl -so /etc/logstash/conf.d/01-wazuh.conf https://raw.githubusercontent.com/wazuh/wazuh/3.6/extensions/logstash/01-wazuh-local.conf# usermod -a -G osseclogstash

 

  • Next step is to correct folder owner for certain Logstash directories:

 

# chown -R logstash:logstash /usr/share/logstash# chown -R logstash:logstash /var/lib/logstash

 

Note: if logstash still shows writing issues in the logs increase the permissions to

 

  • chmod -R 766 /usr/share/logstash
  • systemctl restart logstash
  •  

 

Now restart Logstash:

 

# systemctl restart logstash

 

 

5. Restart Logstash & run the curl command to ensure its not readonly.

 

  • # systemctl restart logstash

 

  • curl XPUT http://localhost:9200/_settings ‘Content-Type: application/json’ d‘ { “index”: { “blocks”: { “read_only_allow_delete“: “false” } } } ‘
  • 6. Now check again your Logstash log file:

6. Now check again your Logstash log file:

 

# cat /var/log/logstash/logstash-plain.log | grep –i -E “(error|warning|critical)”

 

Hopefully you see no errors being generated

 

Next check the plain log

  • tail -10 /var/log/logstash/logstash-plain.log

 

Good log output:

[root@waz01~]# tail -10 /var/log/logstash/logstash-plain.log

[2018-10-15T23:44:41,581][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won’t be used to determine the document _type {:es_version=>6}

[2018-10-15T23:44:41,604][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>”LogStash::Outputs::ElasticSearch“, :hosts=>[“//localhost:9200”]}

[2018-10-15T23:44:41,616][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}

[2018-10-15T23:44:41,641][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{“template”=>”logstash-*”, “version”=>60001, “settings”=>{“index.refresh_interval“=>”5s”}, “mappings”=>{“_default_”=>{“dynamic_templates”=>[{“message_field”=>{“path_match”=>”message”, “match_mapping_type“=>”string”, “mapping”=>{“type”=>”text”, “norms”=>false}}}, {“string_fields“=>{“match”=>”*”, “match_mapping_type“=>”string”, “mapping”=>{“type”=>”text”, “norms”=>false, “fields”=>{“keyword”=>{“type”=>”keyword”, “ignore_above“=>256}}}}}], “properties”=>{“@timestamp”=>{“type”=>”date”}, “@version”=>{“type”=>”keyword”}, “geoip“=>{“dynamic”=>true, “properties”=>{“ip“=>{“type”=>”ip“}, “location”=>{“type”=>”geo_point“}, “latitude”=>{“type”=>”half_float“}, “longitude”=>{“type”=>”half_float“}}}}}}}}

[2018-10-15T23:44:41,662][INFO ][logstash.filters.geoip ] Using geoip database {:path=>”/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-geoip-5.0.3-java/vendor/GeoLite2-City.mmdb”}

[2018-10-15T23:44:41,925][INFO ][logstash.inputs.file ] No sincedb_path set, generating one based on the “path” setting {:sincedb_path=>”/var/lib/logstash/plugins/inputs/file/.sincedb_b6991da130c0919d87fbe36c3e98e363″, :path=>[“/var/ossec/logs/alerts/alerts.json“]}

[2018-10-15T23:44:41,968][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>”main”, :thread=>”#<Thread:0x63e37301 sleep>”}

[2018-10-15T23:44:42,013][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[2018-10-15T23:44:42,032][INFO ][filewatch.observingtail ] START, creating Discoverer, Watch with file and sincedb collections

[2018-10-15T23:44:42,288][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}

 

Now that we have all clear, let’s check component by component:

 

1. Check last 10 alerts generated in your Wazuh manager. Also, check the field timestamp, we must take care about the timestamp.

 

tail 10 /var/ossec/logs/alerts/alerts.json

2. If the Wazuh manager is generating alerts from your view (step 1), then let’s check if Logstash is reading our alerts. You should see two processes: java for Logstash and ossec-ana from Wazuh.

 

# lsof /var/ossec/logs/alerts/alerts.json (ossec-ana & java should be running if not restart ossec)

[root@waz01~]#lsof /var/ossec/logs/alerts/alerts.json

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

ossec-ana 4662ossec 10w REG 253,3 2060995503 201341089 /var/ossec/logs/alerts/alerts.json

java 11924 logstash 93r REG 253,3 2060995503 201341089 /var/ossec/logs/alerts/alerts.json

3. If Logstash is reading our alerts, let’s check if there is an Elasticsearch index for today (wazuh-alerts-3.x-2018.10.16)):

 

curl localhost:9200/_cat/indices/wazuhalerts3.x-*

 

[root@waz01~]# curl localhost:9200/_cat/indices/wazuh-alerts-3.x-*

yellow open wazuh-alerts-3.x-2018.09.07 HLNDuMjHS1Ox3iLoSwFE7g 5 1 294 0 1000.8kb 1000.8kb

yellow open wazuh-alerts-3.x-2018.09.25 Eg1rvDXbSNSq5EqJAtSm_A 5 1 247998 0 87.7mb 87.7mb

yellow open wazuh-alerts-3.x-2018.09.05 HHRnxqjtTKimmW6FEUUfdw 5 1 143 0 679.6kb 679.6kb

yellow open wazuh-alerts-3.x-2018.09.08 MqIJtCNQR3aU3inuv-pxpw 5 1 183 0 748kb 748kb

yellow open wazuh-alerts-3.x-2018.09.15 GIx8fMXnQ3ukrSkKmjbViQ 5 1 171191 0 45.9mb 45.9mb

yellow open wazuh-alerts-3.x-2018.10.10 W3pw1hDwSp2QAtRm0hwoaQ 5 1 896799 0 662.6mb 662.6mb

yellow open wazuh-alerts-3.x-2018.10.15 rnC7kyXRQSCSXm6wVCiWOw 5 1 2628257 0 1.8gb 1.8gb

yellow open wazuh-alerts-3.x-2018.10.02 nKEdjkFOQ9abitVi_dKF3g 5 1 727934 0 232.7mb 232.7mb

yellow open wazuh-alerts-3.x-2018.09.21 FY0mIXGQQHmCpYgRgOIJhg 5 1 203134 0 63.5mb 63.5mb

yellow open wazuh-alerts-3.x-2018.10.01 mvYSVDZJSfa-F_5dKIBwAg 5 1 402155 0 129.9mb 129.9mb

yellow open wazuh-alerts-3.x-2018.10.18 _2WiGz6fRXSNyDjy8qPefg 5 1 2787147 0 1.8gb 1.8gb

yellow open wazuh-alerts-3.x-2018.09.19 ebb9Jrt1TT6Qm6df7VjZxg 5 1 201897 0 58.3mb 58.3mb

yellow open wazuh-alerts-3.x-2018.09.13 KPy8HfiyRyyPeeHpTGKJNg 5 1 52530 0 13.7mb 13.7mb

yellow open wazuh-alerts-3.x-2018.10.23 T7YJjWhgRMaYyCT-XC1f5w 5 1 1074081 0 742.6mb 742.6mb

yellow open wazuh-alerts-3.x-2018.10.03 bMW_brMeRkSDsJWL6agaWg 5 1 1321895 0 715mb 715mb

yellow open wazuh-alerts-3.x-2018.09.18 B1wJIN1SQKuSQbkoFsTmnA 5 1 187805 0 52.4mb 52.4mb

yellow open wazuh-alerts-3.x-2018.09.04 CvatsnVxTDKgtPzuSkebFQ 5 1 28 0 271.1kb 271.1kb

yellow open wazuh-alerts-3.x-2018.10.21 AWVQ7D8VS_S0DHiXvtNB1Q 5 1 2724453 0 1.8gb 1.8gb

yellow open wazuh-alerts-3.x-2018.09.27 8wRF0XhXQnuVexAxLF6Y5w 5 1 233117 0 79.2mb 79.2mb

yellow open wazuh-alerts-3.x-2018.10.13 wM5hHYMCQsG5XCkIquE-QA 5 1 304830 0 222.4mb 222.4mb

yellow open wazuh-alerts-3.x-2018.09.12 1aB7pIcnTWqZPZkFagHnKA 5 1 73 0 516kb 516kb

yellow open wazuh-alerts-3.x-2018.09.29 BXyZe2eySkSlwutudcTzNA 5 1 222734 0 73.7mb 73.7mb

yellow open wazuh-alerts-3.x-2018.10.04 x8198rpWTxOVBgJ6eTjJJg 5 1 492044 0 364.9mb 364.9mb

yellow open wazuh-alerts-3.x-2018.09.23 ZQZE9KD1R1y6WypYVV5kfg 5 1 216141 0 73.7mb 73.7mb

yellow open wazuh-alerts-3.x-2018.09.22 60AsCkS-RGG0Z2kFGcrbxg 5 1 218077 0 74.2mb 74.2mb

yellow open wazuh-alerts-3.x-2018.10.12 WdiFnzu7QlaBetwzcsIFYQ 5 1 363029 0 237.7mb 237.7mb

yellow open wazuh-alerts-3.x-2018.09.24 Loa8kM7cSJOujjRzvYsVKw 5 1 286140 0 106.3mb 106.3mb

yellow open wazuh-alerts-3.x-2018.09.17 zK3MCinOSF2_3rNAJnuPCQ 5 1 174254 0 48.3mb 48.3mb

yellow open wazuh-alerts-3.x-2018.10.17 A4yCMv4YTuOQWelbb3XQtQ 5 1 2703251 0 1.8gb 1.8gb

yellow open wazuh-alerts-3.x-2018.09.02 lt8xvq2ZRdOQGW7pSX5-wg 5 1 148 0 507kb 507kb

yellow open wazuh-alerts-3.x-2018.08.31 RP0_5r1aQdiMmQYeD0-3CQ 5 1 28 0 247.8kb 247.8kb

yellow open wazuh-alerts-3.x-2018.09.28 iZ2J4UMhR6y1eHH1JiiqLQ 5 1 232290 0 78.6mb 78.6mb

yellow open wazuh-alerts-3.x-2018.09.09 FRELA8dFSWy6aMd12ZFnqw 5 1 428 0 895.1kb 895.1kb

yellow open wazuh-alerts-3.x-2018.09.16 uwLNlaQ1Qnyp2V9jXJJHvA 5 1 171478 0 46.5mb 46.5mb

yellow open wazuh-alerts-3.x-2018.10.14 WQV3dpLeSdapmaKOewUh-Q 5 1 226964 0 154.9mb 154.9mb

yellow open wazuh-alerts-3.x-2018.09.11 2Zc4Fg8lR6G64XuJLZbkBA 5 1 203 0 772.1kb 772.1kb

yellow open wazuh-alerts-3.x-2018.10.16 p2F-trx1R7mBXQUb4eY-Fg 5 1 2655690 0 1.8gb 1.8gb

yellow open wazuh-alerts-3.x-2018.08.29 kAPHZSRpQqaMhoWgkiXupg 5 1 28 0 236.6kb 236.6kb

yellow open wazuh-alerts-3.x-2018.08.28 XmD43PlgTUWaH4DMvZMiqw 5 1 175 0 500.9kb 500.9kb

yellow open wazuh-alerts-3.x-2018.10.19 O4QFPk1FS1urV2CGM2Ul4g 5 1 2718909 0 1.8gb 1.8gb

 

4. If Elasticsearch has an index for today (wazuh-alerts-3.x-2018.10.16), the problem is probably selected time range in Kibana. To discard any error related to this, please go to Kibana > Discover, and look for

alerts in that section of Kibana itself. If there are alerts from today in the Discover section.

  •  

 

  •  

 

  •  

 

 

This means the Elasticsearch stack is finally working (at least at index level)

 

Reason 4: Agent buffer is full due to flood events. If this occurs events are not logged and data is lost. We want to drill down on a specific agent to figure out what is causing the issue.

Try to fetch data directly from Elasticsearch for the today’s index and for the agent 013. Copy and paste the next query in the Kibana dev tools:

 

GET wazuhalerts3.x2018.10.17/_search{  “query”: {    “match”: {      agent.id: “013”    }  }}

 

 

This should provide a log an output to show that the agent is logged in the indices for that day. If this is successful then we know that the logs are coming and kibana is able to communicate.

 

Next steps

  • Login using SSH into the agent “013” and execute the next command:

 

wc /var/log/audit/audit.log  cut d‘/’ f1 (centos)

wc /var/log/audit/syslog  cut d‘/’ f1(ubuntu)

 

  • root@wazuh-03:/var/log# wc -l /var/log/syslog | cut -d’/’ -f1
  • 36451 

36451 

 

Also, it would be nice if you provide us your audit rules, let’s check them using the next command:

 

# auditctl -l

 

 

It should show you a positive number, and that number is the number of lines in the audit.log file. Note down it.

 

  • Now restart the Wazuh agent:

 

# systemctl restart wazuh-agent

 

We need to wait for syscheck scan is finished, this trick is useful to know exactly when it’s done:

 

# tail -f /var/ossec/logs/ossec.log | grep syscheck | grep Ending

The above command shouldn’t show anything until the scan is finished (it could take some time, be patient please). At the end, you should see a line like this:

 

2018/10/17 13:36:03 ossecsyscheckd: INFO: Ending syscheck scan (forwarding database).

Now, it’s time for checking the audit.log file again:

 

wc /var/log/audit/audit.log  cut d‘/’ f1

wc /var/log/audit/syslog  cut d‘/’ f1

If you still see the agent buffer full after these steps then we need to do debugging.

tail -f /var/ossec/logs/ossec.log | grep syscheck | grep Ending

root@waz03:/var/log# cat /var/ossec/logs/ossec.log | grep –i -E “(error|warning|critical)”

2018/10/17 00:09:08 ossec-agentd: WARNING: Agent buffer at 90 %.

2018/10/17 00:09:08 ossec-agentd: WARNING: Agent buffer is full: Events may be lost.

2018/10/17 12:10:20 ossec-agentd: WARNING: Agent buffer at 90 %.

2018/10/17 12:10:20 ossec-agentd: WARNING: Agent buffer is full: Events may be lost.

2018/10/17 14:25:20 ossec-logcollector: ERROR: (1103): Could not open file ‘/var/log/messages’ due to [(2)-(No such file or directory)].

2018/10/17 14:25:20 ossec-logcollector: ERROR: (1103): Could not open file ‘/var/log/secure’ due to [(2)-(No such file or directory)].

2018/10/17 14:26:08 ossec-agentd: WARNING: Agent buffer at 90 %.

2018/10/17 14:26:08 ossec-agentd: WARNING: Agent buffer is full: Events may be lost.

2018/10/17 14:28:18 ossec-logcollector: ERROR: (1103): Could not open file ‘/var/log/messages’ due to [(2)-(No such file or directory)].

2018/10/17 14:28:18 ossec-logcollector: ERROR: (1103): Could not open file ‘/var/log/secure’ due to [(2)-(No such file or directory)].

2018/10/17 14:29:06 ossec-agentd: WARNING: Agent buffer at 90 %.

2018/10/17 14:29:06 ossec-agentd: WARNING: Agent buffer is full: Events may be lost.

 

Debugging json alerts for specific agent 13

Ok, let’s debug your agent events using logall_json in the Wazuh manager instance.

 

Login using SSH into the Wazuh manager instance and edit the ossec.conf file.

  • Edit the file /var/ossec/etc/ossec.conf and look for the <global> section, then enable <logall_json>

 

<logall_json>yes</logall_json>

 

2. Restart the Wazuh manager

 

# systemctl restart wazuh-manager

3. Login using SSH into the Wazuhagent(13) instance, restart it and tail -f until it shows you the warning message:

 

# systemctl restart wazuh-agent# tail -f /var/ossec/logs/ossec.log | grep WARNING

4. Once you see ossec-agentd: WARNING: Agent buffer at 90 %. in the Wazuh agent logs, 

    then switch your CLI to the Wazuh manager instance again and

    the next file we want to tail is from your Wazuh manager:

 

tail –f /var/ossec/logs/archives/archives.json

 

5. Now we can take a look into events in order to clarify what is flooding the agent “013”.

 

Once you have the log is seen, you can disable logall_json and restart the Wazuh manager.

 

 

6.

Log from tail –f /var/ossec/logs/archives/archives.json (wazuh-manager)

 

 

{“timestamp”:”2018-10-17T18:06:17.33+0100″,”rule”:{“level”:7,”description”:”Host-based anomaly detection event (rootcheck).”,”id”:”510″,”firedtimes”:3352,”mail”:false,”groups”:[“ossec”,”rootcheck”],”gdpr”:[“IV_35.7.d”]},”agent”:{“id”:”013″,”na

me”:”waz03“,”ip”:”10.79.244.143″},”manager”:{“name”:”waz01“},”id”:”1539795977.2752038221″,full_log”:”File ‘/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/e26aa5b1’ is owned by root and has written permissions to anyone.”,”decoder“:{“name”:”rootcheck“},”data”:{“title”:”File is owned by root and has written permissions to anyone.”,”file”:”/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/e26aa5b1″},”location”:”rootcheck”}

{“timestamp”:”2018-10-17T18:06:17.35+0100″,”rule”:{“level”:7,”description”:”Host-based anomaly detection event (rootcheck).”,”id”:”510″,”firedtimes”:3353,”mail”:false,”groups”:[“ossec”,”rootcheck”],”gdpr”:[“IV_35.7.d”]},”agent”:{“id”:”013″,”name”:”waz03“,”ip”:”10.79.244.143″},”manager”:{“name”:”waz01“},”id”:”1539795977.2752038739″,”full_log”:”File ‘/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/12cb9011’ is owned by root and has written permissions to anyone.”,”decoder“:{“name”:”rootcheck“},”data”:{“title”:”File is owned by root and has written permissions to anyone.”,”file”:”/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/12cb9011″},”location”:”rootcheck”}

{“timestamp”:”2018-10-17T18:06:17.37+0100″,”rule”:{“level”:7,”description”:”Host-based anomaly detection event (rootcheck).”,”id”:”510″,”firedtimes”:3354,”mail”:false,”groups”:[“ossec”,”rootcheck”],”gdpr”:[“IV_35.7.d”]},”agent”:{“id”:”013″,”name”:”waz03“,”ip”:”10.79.244.143″},”manager”:{“name”:”waz01“},”id”:”1539795977.2752039257″,”full_log”:”File ‘/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/4a930107’ is owned by root and has written permissions to anyone.”,”decoder“:{“name”:”rootcheck“},”data”:{“title”:”File is owned by root and has written permissions to anyone.”,”file”:”/var/lib/kubelet/pods/2ff462ce-7233-11e8-8282-005056b518e6/containers/install-cni/4a930107″},”location”:”rootcheck”}

{“timestamp”:”2018-10-17T18:06:17.40+0100″,”rule“:{“level”:7,”description”:”Host-based anomaly detection event

 

 

From the above log we can see that kubernetes is sending a lot of events to the agent causing the buffer to fill up. To solve this we particular issue from happening in future. We can disable this at the client level or the global level.

   Here you can see the number of events from rootcheck in your archives.json:

 

cat archives.json  grep rootcheck  wc l489

 

Here you can see the number of events from rootcheck and rule 510 in thearchives.json:

 

cat archives.json  grep rootcheck  grep 510  wc l489

 

Here you can see the number of events from rootcheck and rule 510 and including “/var/lib/kubelet/pods/  in your archives.json:

 

cat archives.json  grep rootcheck  grep 510  grep /var/lib/kubelet/pods/  wc l489

 

So we have two options:

 

Option 1. Edit the ossec.conf from your Wazuh agent “013”. (This is the one I did)

 

– Login using SSH into the Wazuh agent “013” instance.

– Edit the file /var/ossec/etc/ossec.conf, and look for the rootcheck block, then put a <ignore> block for that directory. 

 

<rootcheck><ignore>/var/lib/kubelet</ignore></rootcheck>

 

Restart the Wazuh agent “013”

 

# systemctl restart wazuh-agent

 

Option 2. Check in which group is your agent and edit its centralized configuration.

 

– Login using SSH into the Wazuh manager instance.

– Check the group where is agent “013”

 

# /var/ossec/bin/agent_groups -s –i 013

 

– Note down the group, example: default

– Edit the file under /var/ossec/etc/shared/default/agent.conf (replace default by the real group name, it could be different from my example), 

then add the rootcheck ignore inside the <agent_config> block, example:

 

<agent_config>  <!– Shared agent configuration here –>  <rootcheck>    <ignore>/var/lib/kuberlet</ignore>  </rootcheck></agent_config>

 

– Restart the Wazuh manager

 

# systemctl restart wazuh-manager

 

 

 

– Restart the agent on client as well

 

# systemctl restart wazuhagent

 

 

The solution #1 takes effect immediately.

 

The solution #2 will push the new configuration from the Wazuh manager to the Wazuh agent, once the agent receives it, 

it auto restarts itself automatically and then it applies the new configuration. It could take a bit more time than solution #1.

 

On a side note, you can take a look at this useful link about the agent flooding:

The above link talks about how to prevent from being flooded. 

Now the agent should show correctly in the 15min time range. If a bunch of client had the issue then you need to use ansible to send out a agent restart on all clients or setup a cron on all the machines to restart the agent every 24 hours.

 

 

Discover on the agent should also show logs

 

 

 

Ansible adhoc command or playbook.

 

Example:

 

  • ansible –i hostslinuxdevelopment -a “sudo systemctl restart wazuh-agent” –vault-password-file /etc/ansible/vaultpw.txt -u ansible_nickt -k -K

 

 

 

 

 

 

 

 

 

 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *