Gauge for FilesystemIsReadOnly not downgraded to 0 after fixing the problem #474

@sharonosbourne

The problem occurred when a filesystem went into read-only mode. After that was fixed, the metrics still showed both the counter and the gauge set to 1.
To test this, I injected the FilesystemIsReadOnly pattern into /dev/kmsg several times (rule taken from https://github.com/kubernetes/node-problem-detector/blob/master/config/kernel-monitor.json):

1 log_monitor.go:160] New status generated: &{Source:kernel-monitor Events:[{Severity:info Timestamp:2020-10-08 06:44:16.09315274 +0000 UTC m=+1331754.148888064 Reason:FilesystemIsReadOnly Message:Node condition ReadonlyFilesystem is now: True, reason: FilesystemIsReadOnly}] Conditions:[{Type:KernelDeadlock Status:False Transition:2020-09-22 20:48:21.98500453 +0000 UTC m=+0.040739839 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:True Transition:2020-10-08 06:44:16.09315274 +0000 UTC m=+1331754.148888064 Reason:FilesystemIsReadOnly Message:Remounting filesystem read-only}]}

The metrics still showed 1 and did not go back to 0, even after the read-only filesystem issue was fixed:

problem_counter{reason="FilesystemIsReadOnly"} 1
problem_gauge{reason="FilesystemIsReadOnly",type="ReadonlyFilesystem"} 1

As a workaround, I deleted the pod, after which the metrics were reset to 0.
What is the reason for this behaviour? Is it the "permanent" type? Is deleting the pod the only solution?

kernel-monitor.json

	{
		"type": "permanent",
		"condition": "ReadonlyFilesystem",
		"reason": "FilesystemIsReadOnly",
		"pattern": "Remounting filesystem read-only"
	}
