This started as a self-challenge over the 2024 Carnival holiday weekend and was later extended into an optional exercise for [an MBA in AI for Businesses](https://exame.com/faculdade/mba/mba-em-inteligencia-artificial-para-negocios). Treat it as proof-of-concept maturity.
This repository reimplements KMeans in plain C# as an externally schedulable job¹, plus a minimal effort to expose it as a Spring Boot 3 API². It also provides some form of outlier detection.
¹ Deployable via crontab, your internal corporate solution, AWS Batch, Azure Batch, GCP Preemptible VMs, Oracle Burstable Instances, or anything else that runs a command line at a scheduled time. May require tweaking to fit your needs.<br>
² Deployable in your corporate Tomcat, WildFly, Docker, Kubernetes, Cloudflare Workers, AWS Lambda, GCP Cloud Functions, or anything else you use to serve a Java Web API. May require tweaking to fit your needs.
| Kernel | Linux 6.7.4-zen1 | Darwin 23.0.0 | Linux 6.7.0-zen3 | Linux 6.6.8-arch-1 |
Therefore, we should see some memory buses saturated.
³ There are parallelization paths, identified by the prefix “10x”, but I believe that in a corporate environment many jobs would be running in parallel, so the predictability of a stable resource allocation would matter more.
### RAM resource
Memory was measured by watching the numbers on each system's resource monitor: `htop` on Linux and `Activity Monitor` on macOS.
- KMeans is an old algorithm, known [since at least 1956](https://stats.stackexchange.com/a/82740);
- WSS (Within-Cluster Sum of Squares) is just a fancy name for a specific kind of variance, a concept that has existed [since at least 1923](https://link.springer.com/chapter/10.1007/978-1-4612-6079-0_4);
- Silhouette, the newest of them all, was [proposed in 1987](<https://en.wikipedia.org/wiki/Silhouette_(clustering)>).
- Programming-wise:
- C# is an old programming language, available [since 2002](https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-version-history#c-version-10-1);
- Spring Boot is an old web framework, available [since 2014](https://spring.io/blog/2014/04/01/spring-boot-1-0-ga-released);
- Java is an old programming language, available [since 1996](https://en.wikipedia.org/wiki/Java_version_history#Release_table).
If you “innovate” using these technologies in your business, you are just removing a century's worth of technical debt from your outworn processes.
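
To make the WSS remark in the list above concrete, here is a minimal Java sketch. The class and method names are hypothetical and not taken from this repository. It assumes squared Euclidean distance and a single non-empty cluster, and it compares WSS with the cluster size multiplied by the sum of the per-dimension population variances.

```java
// Hypothetical sketch: names are illustrative, not this repository's API.
final class WssSketch {

    /** Per-coordinate mean of the points in one cluster. */
    static double[] centroid(double[][] points) {
        int dims = points[0].length;
        double[] mean = new double[dims];
        for (double[] p : points) {
            for (int d = 0; d < dims; d++) {
                mean[d] += p[d];
            }
        }
        for (int d = 0; d < dims; d++) {
            mean[d] /= points.length;
        }
        return mean;
    }

    /** WSS: sum of squared Euclidean distances from each point to the centroid. */
    static double wss(double[][] points) {
        double[] c = centroid(points);
        double sum = 0.0;
        for (double[] p : points) {
            for (int d = 0; d < c.length; d++) {
                double diff = p[d] - c[d];
                sum += diff * diff;
            }
        }
        return sum;
    }

    /** Population variance (divides by n) of a single coordinate. */
    static double variance(double[][] points, int dim) {
        double mean = centroid(points)[dim];
        double sum = 0.0;
        for (double[] p : points) {
            double diff = p[dim] - mean;
            sum += diff * diff;
        }
        return sum / points.length;
    }

    public static void main(String[] args) {
        double[][] cluster = { {1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0} };
        double total = wss(cluster);
        // Same quantity, rebuilt from plain variances: n * (var_x + var_y).
        double viaVariance = cluster.length * (variance(cluster, 0) + variance(cluster, 1));
        System.out.println("WSS = " + total + ", n * sum of variances = " + viaVariance);
    }
}
```

Up to floating-point rounding, the two values agree, which is the sense in which WSS is a variance scaled by the cluster size.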