Skip to content

Conversation

@gndrmnn
Copy link

@gndrmnn gndrmnn commented Jan 21, 2026

Consider this minimal example:

---
apiVersion: openstack.k-orc.cloud/v1alpha1
kind: SecurityGroup
metadata:
  name: security-group
spec:
  cloudCredentialsRef:
    cloudName: openstack
    secretName: cloud-config
  resource:
    name: security-group
    description: k-orc generated
    rules:
      - description: rule k-orc
        direction: ingress
        protocol: icmp
        # ethertype: IPv4
        # ethertype: IPv6
        remoteIPPrefix: "0.0.0.0/0"
        portRange:
          min: 1
          max: 42

If we select ethertype is IPv4, everything is fine and the security group rules as well as the security group is created successfully.

However, if we select ethertype is IPv6 the remoteIPPrefix field is (expectedly) rejected by openstack with a http error. The security group is already created at this point and, despite the rule error, enters a true availability state.

With this commit SecurityGroups now count their specified rules and compare them to the number of rules in their ORC status and the openstack resource. As the security group rules are only ever part of one security group, this should be enough to reliably determine, if all rules have been successfully created.

As the mismatch between remoteIPPrefix and ethertype is apparently to be solved "client-side" by ORC in the future, I am not sure, if and how to add a test for this? I think the ultimate reason why a security group rule is rejected by openstack in the future is not known yet, right?

Fixes #120

@github-actions github-actions bot added the semver:patch No API change label Jan 21, 2026
Copy link
Collaborator

@mandre mandre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing the length of security group rules between the spec and the status indeed seems like the right approach for deciding whether the security group should be considered as available.

However I wonder if we should not reschedule a reconcile similarly to what we do in server or volume for example. That's very likely the reason why the flamingo job failed.

SecurityGroups now count their specified rules and compare them
to the number of rules in their ORC status and the openstack resource.
As the security group rules are only ever part of one security group,
this should be enough to reliably determine if all rules have been
successfully created.

On-behalf-of: SAP nils.gondermann@sap.com
@gndrmnn
Copy link
Author

gndrmnn commented Jan 22, 2026

However I wonder if we should not reschedule a reconcile similarly to what we do in server or volume for example. That's very likely the reason why the flamingo job failed.

Got it.

Do you want me to add tests for this change? If so, is it possible to simulate an openstack error using kuttl?

return metav1.ConditionFalse, progress.WaitingOnOpenStack(progress.WaitingOnReady, securityGroupAvailablePollingPeriod)
}

if len(resourceSpec.Rules) != len(resourceStatus.Rules) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be comparing the number of rules in the spec and in the status, because they can be different: the security group inherits the default security group rules.
Instead, we should check that the desired state (the rules in the spec) accurately represents the observed state (the rules in the status) AND account for the default security rules. The actuator should already do exactly that for the rules update.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be comparing the number of rules in the spec and in the status, because they can be different: the security group inherits the default security group rules.

I am not sure I fully understand. So every security group implicitly inherits the rules of the security group default, right? But this is not shown explicitly in openstack security group show, or the Horizon Web UI. Do I understand this correctly?

But it seems like ORC also does not currently use the (inherited) default rules to populate the status of the security group, so that should not actually be a problem. Unless this is unwanted behaviour and ORC is supposed to do that in the future, in which case the change in this PR will indeed break.

Instead, we should check that the desired state (the rules in the spec) accurately represents the observed state (the rules in the status) AND account for the default security rules. The actuator should already do exactly that for the rules update.

Yes, my understanding is that the actuator will take care of making sure the rules are matching from a content perspective. That being said, the rules should only become part of the security group resource status if and when they match. So to update the Available status it should be enough to make sure the number of rules is the same, without duplicating the validation logic of the rule's content, or am I not seeing the problem right now?

}
}

// SecurityGroup is available as soon as it exists
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts, I wonder if this is really what we want: the security group is indeed available right away, meaning we can do operations on it. The Available condition is accurate.

The Progressing condition, on the other hand, should reflect that we still have security group rules being provisioned, and that the desired state does not yet match the observed state.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue #120 states that the current way to represent the Available status in security groups is undesired. As far as I understand, security group rules can only ever be part of a single security group. And they can not be cross-referenced. So I would agree with #120 in that a security group is only "available and matches the spec" when the security groups are actually created.

However, this design decision has to be decided by the project according to it's own goals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver:patch No API change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SecurityGroup incorrectly marked available if rules fail

2 participants