{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# ML Practice Series: Module 05 - K-Means Clustering\n",
                "\n",
                "Welcome to the final module of this foundational series! Here we explore **Unsupervised Learning** through **K-Means Clustering**.\n",
                "\n",
                "### Objectives:\n",
                "1. **Unsupervised Learning**: Pattern discovery without labels.\n",
                "2. **K-Means**: How the algorithm groups data.\n",
                "3. **Elbow Method**: Deciding the number of clusters (K).\n",
                "\n",
                "---"
            ]
        },
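        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "To make the second objective concrete, here is a minimal NumPy sketch of the two steps K-Means alternates between: assign every point to its nearest centroid, then move each centroid to the mean of its points. The helper name `kmeans_sketch` and the toy data are assumptions made for illustration. scikit-learn's `KMeans` adds smarter initialization and multiple restarts, so treat this as a teaching aid rather than a replacement.\n",
                "\n",
                "```python\n",
                "import numpy as np\n",
                "\n",
                "def kmeans_sketch(X, k, n_iter=100, seed=0):\n",
                "    # Hypothetical helper: plain assign/update iteration, not sklearn's code.\n",
                "    rng = np.random.default_rng(seed)\n",
                "    # Start from k distinct data points chosen at random.\n",
                "    centroids = X[rng.choice(len(X), size=k, replace=False)]\n",
                "    for _ in range(n_iter):\n",
                "        # Assignment step: index of the nearest centroid for every point.\n",
                "        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)\n",
                "        labels = dists.argmin(axis=1)\n",
                "        # Update step: mean of each cluster (an empty cluster keeps its old centroid).\n",
                "        new_centroids = np.array([\n",
                "            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]\n",
                "            for j in range(k)\n",
                "        ])\n",
                "        # Stop once the centroids no longer move.\n",
                "        if np.allclose(new_centroids, centroids):\n",
                "            break\n",
                "        centroids = new_centroids\n",
                "    return labels, centroids\n",
                "\n",
                "# Toy data: two well-separated groups of three points each.\n",
                "toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],\n",
                "                [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])\n",
                "toy_labels, toy_centroids = kmeans_sketch(toy, k=2)\n",
                "print(toy_labels)\n",
                "print(toy_centroids)\n",
                "```\n",
                "\n",
                "On data this well separated, the first three points should share one label and the last three the other. Which group gets label 0 depends on the random starting centroids."
            ]
        },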
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 1. Setup\n",
                "We generate a synthetic dataset with `make_blobs` so that the clusters are easy to see in a scatter plot."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "import pandas as pd\n",
                "import numpy as np\n",
                "import matplotlib.pyplot as plt\n",
                "import seaborn as sns\n",
                "from sklearn.cluster import KMeans\n",
                "from sklearn.datasets import make_blobs\n",
                "\n",
                "# Generate synthetic data\n",
                "X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=42)\n",
                "df = pd.DataFrame(X, columns=['Feature 1', 'Feature 2'])\n",
                "\n",
                "plt.scatter(df['Feature 1'], df['Feature 2'], s=30, alpha=0.5)\n",
                "plt.title(\"Original Data (Unlabeled)\")\n",
                "plt.show()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 2. K-Means Implementation\n",
                "\n",
                "### Task 1: Find Optimal K (Elbow Method)\n",
                "Compute the inertia (within-cluster sum of squares) for each K from 1 to 10, then plot inertia against K."
            ]
        },
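        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "For reference, the inertia tracked by the elbow method can be written as below. The symbols are notation chosen for this note rather than scikit-learn names: $C_j$ is the set of points assigned to cluster $j$, and $\\mu_j$ is that cluster's centroid.\n",
                "\n",
                "$$\\text{inertia}(K) = \\sum_{j=1}^{K} \\sum_{x_i \\in C_j} \\lVert x_i - \\mu_j \\rVert^2$$\n",
                "\n",
                "Adding clusters can only lower the best achievable inertia, so the elbow method looks for the K after which the curve flattens rather than for its minimum."
            ]
        },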
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
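                "# Fit one model per candidate K and record its inertia\n",
                "k_values = range(1, 11)\n",
                "inertia = []\n",
                "for k in k_values:\n",
                "    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)\n",
                "    kmeans.fit(X)\n",
                "    inertia.append(kmeans.inertia_)\n",
                "\n",
                "# The elbow is the K after which inertia stops dropping sharply\n",
                "plt.plot(k_values, inertia, 'bx-')\n",
                "plt.xlabel('Number of clusters (K)')\n",
                "plt.ylabel('Inertia')\n",
                "plt.title('Elbow Method')\n",
                "plt.show()\n",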
                "```\n",
                "</details>"
            ]
        },
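        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As a sanity check on what `inertia_` measures, the sketch below recomputes it by hand as the sum of squared distances from each point to the centroid of its assigned cluster. The names `X_demo`, `km_demo`, and `manual_inertia` are illustrative, and the data is regenerated with the same `make_blobs` settings as in Setup so the block runs on its own.\n",
                "\n",
                "```python\n",
                "import numpy as np\n",
                "from sklearn.cluster import KMeans\n",
                "from sklearn.datasets import make_blobs\n",
                "\n",
                "# Same synthetic data as in the Setup cell.\n",
                "X_demo, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=42)\n",
                "km_demo = KMeans(n_clusters=4, random_state=42, n_init=10).fit(X_demo)\n",
                "\n",
                "# Look up, for every point, the centroid of the cluster it was assigned to.\n",
                "assigned_centroids = km_demo.cluster_centers_[km_demo.labels_]\n",
                "# Sum of squared Euclidean distances to those centroids.\n",
                "manual_inertia = np.sum((X_demo - assigned_centroids) ** 2)\n",
                "\n",
                "print(manual_inertia, km_demo.inertia_)\n",
                "```\n",
                "\n",
                "The two printed values should agree up to floating-point error."
            ]
        },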
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Task 2: Fit K-Means\n",
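                "From the elbow plot, pick the K at which inertia stops dropping sharply (here it looks like 4). Fit the model with that K and store each point's cluster label in `df['cluster']`."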
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
                "kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)\n",
                "df['cluster'] = kmeans.fit_predict(X)\n",
                "```\n",
                "</details>"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Task 3: Visualize Clusters\n",
                "Redraw the scatter plot, coloring each point by its assigned cluster and marking the cluster centroids."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# YOUR CODE HERE\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<details>\n",
                "<summary><b>Click to see Solution</b></summary>\n",
                "\n",
                "```python\n",
                "# Points colored by their assigned cluster label\n",
                "plt.scatter(df['Feature 1'], df['Feature 2'], c=df['cluster'], cmap='viridis', s=30)\n",
                "# Centroids learned by the fitted model\n",
                "centers = kmeans.cluster_centers_\n",
                "plt.scatter(centers[:, 0], centers[:, 1], c='red', s=200, marker='X', label='Centroids')\n",
                "plt.legend()\n",
                "plt.title(\"Clustered Data\")\n",
                "plt.show()\n",
                "```\n",
                "</details>"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "---\n",
                "### Congratulations!\n",
                "You've completed the foundational Machine Learning practice series.\n",
                "You now have hands-on experience with:\n",
                "1. EDA & Feature Engineering\n",
                "2. Linear Regression\n",
                "3. Logistic Regression\n",
                "4. Decision Trees & Random Forests\n",
                "5. K-Means Clustering\n",
                "\n",
                "Keep practicing with new datasets!"
            ]
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8.0"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4
}